Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[develop] Removing RRFS-related features. #893

Merged
merged 28 commits into from
Sep 14, 2023

Conversation

christinaholtNOAA
Copy link
Collaborator

DESCRIPTION OF CHANGES:

After recent high-level decisions to discontinue support of GSI, we have decided to discontinue support for any cycling with GSI in SRW. This PR aims to remove all the unnecessary components that have been added to the App consistent with those decisions.

Specifically, the following changes were made:

  • The external repositories for GSI and rrfs-utils were removed from Externals.cfg.
  • The build script options and CMake options for both GSI and rrfs-utils were removed.
  • The j-jobs and ex-scripts related to cycling were removed.
  • The parm/wflow settings for DA data and cycling were removed, along with any config files in ush and the test directories.
  • The GSI-required libraries were removed from the module files.
  • DA Obs entries were removed from the data locations file.
  • RRFS concepts related to cycling and restarting were removed from configuration options and scripts.

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

TESTS CONDUCTED:

  • hera.intel
  • orion.intel
  • cheyenne.intel
  • cheyenne.gnu
  • gaea.intel
  • jet.intel
  • wcoss2.intel
  • NOAA Cloud (indicate which platform)
  • Jenkins
  • fundamental test suite
  • comprehensive tests (specify which if a subset was used)
    It would be helpful if someone could test this branch to ensure it doesn't cause problems for AQM. I have not tested that capability.

DEPENDENCIES:

None.

DOCUMENTATION:

@gspetro-NOAA Let me know what makes sense in terms of updating docs.

ISSUE:

None.

CHECKLIST

  • My code follows the style guidelines in the Contributor's Guide
  • I have performed a self-review of my own code using the Code Reviewer's Guide
  • I have commented my code, particularly in hard-to-understand areas
  • My changes need updates to the documentation. I have made corresponding changes to the documentation
  • My changes do not require updates to the documentation (explain).
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • Any dependent changes have been merged and published

Copy link
Collaborator

@mkavulich mkavulich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One required change and a few comments/questions. You will also need to add the following lines to the tests config.MET_ensemble_verification_only_vx.yaml and config.MET_verification_only_vx.yaml to ensure they pass:

nco:
  NET_default: 'rrfs'

ush/config_defaults.yaml Show resolved Hide resolved
scripts/exregional_run_prdgen_subpiece.sh Show resolved Hide resolved
parm/data_locations.yml Show resolved Hide resolved
scripts/exregional_run_prdgen.sh Show resolved Hide resolved
gspetro-NOAA added a commit to gspetro-NOAA/ufs-srweather-app that referenced this pull request Aug 31, 2023
Copy link
Collaborator Author

@christinaholtNOAA christinaholtNOAA left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkavulich Thanks for the review. I've addressed a bunch of your changes locally, and have made a few others @gspetro-NOAA found while updating the docs. I will test those changes and then push.

parm/data_locations.yml Show resolved Hide resolved
scripts/exregional_run_prdgen.sh Show resolved Hide resolved
@christinaholtNOAA
Copy link
Collaborator Author

I've pushed my tested changes here, and am running a final round of the fundamental tests after I merged develop into my aging branch. I'll report back on those results shortly.

When I ran the comprehensive test suite on hera before I merged develop my only failure was in the GST_release_public_v1 test. I observed the same failure as documented in Issue #874.

Copy link
Collaborator

@mkavulich mkavulich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing my comments!

@natalie-perlin
Copy link
Collaborator

Could Daniel Abdi's #540 be closed if RRFS-related features are removed at this point?
#540

Copy link
Collaborator

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christinaholtNOAA - Thanks for addressing my concerns! All of the fundamental tests are successfully passing now:

----------------------------------------------------------------------------------------------------
Experiment name                                                  | Status    | Core hours used 
----------------------------------------------------------------------------------------------------
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_RAP_suite_RRFS_v1beta    COMPLETE               8.31
nco_grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_timeoffset_suite_  COMPLETE              11.15
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v15p2        COMPLETE               6.43
grid_RRFS_CONUS_25km_ics_FV3GFS_lbcs_FV3GFS_suite_GFS_v17_p8_plot  COMPLETE              12.51
grid_RRFS_CONUScompact_25km_ics_HRRR_lbcs_HRRR_suite_HRRR          COMPLETE              25.76
grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_RAP_suite_WoFS_v0              COMPLETE              13.20
grid_RRFS_CONUS_25km_ics_NAM_lbcs_NAM_suite_GFS_v16                COMPLETE              18.68
----------------------------------------------------------------------------------------------------
Total                                                              COMPLETE              96.04

Approving now.

@MichaelLueken MichaelLueken added the run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests label Sep 11, 2023
@christinaholtNOAA
Copy link
Collaborator Author

@MichaelLueken Thanks for the follow up. I am seeing the same on Hera when I checked back in on my latest run this morning.

Copy link
Collaborator

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christinaholtNOAA - Unfortunately, the Jet coverage tests failed because the process_obs WE2E test is still present in tests/WE2E/machine_suites/coverage.jet. Please remove this test, then I can queue up the automated tests on Jet. Thanks!

@MichaelLueken
Copy link
Collaborator

@christinaholtNOAA - It looks like there are also GSI related changes that should be removed in modulefiles/build_gaea_intel.lua. It looks like lines 19-24 can be removed. Looking at the changes that @natalie-perlin has made in PR #889, these lines have been removed.

@natalie-perlin - is it safe for @christinaholtNOAA to remove these lines as part of her removal of RRFS-related features in this PR? Thanks!

Copy link
Collaborator

@MichaelLueken MichaelLueken left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much, @christinaholtNOAA! Reapproving now and resubmitting Gaea and Jet tests.

@@ -20,7 +20,6 @@ local MKLROOT="/opt/intel/oneapi/mkl/2022.0.2/"
prepend_path("LD_LIBRARY_PATH",pathJoin(MKLROOT,"lib/intel64"))
pushenv("MKLROOT", MKLROOT)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@christinaholtNOAA ,
all these could be removed, too, as they were needed for GSI only:

prepend_path("LD_LIBRARY_PATH",pathJoin(MKLROOT,"lib/intel64"))
pushenv("MKLROOT", MKLROOT)
pushenv("CRAYPE_LINK_TYPE","dynamic")

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternatively, I could take care of these when updating the modulefiles #889 , as it may be scheduled to be merged after the present one (993)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@natalie-perlin I can go ahead and remove those here.

@MichaelLueken
Copy link
Collaborator

@christinaholtNOAA - The rerun of the Jenkins tests on Gaea successfully passed. The Jet tests that require pulling data from HPSS failed to pull data from HPSS with the following errors:

ERROR: htar_BlockRead: Error 5 reading 288 bytes from HPSS-resident file: 
ERROR: htar_ReadIndexHeader: Error -1 reading HPSS-resident index file header: 
HTAR: HTAR FAILED

Resubmitting the tests this morning to see if they will pass. Once successful, I can merge this work.

@MichaelLueken
Copy link
Collaborator

Just as a head's up, according to the Jet helpdesk, the issue with accessing Jet via the Princeton bastion is also causing issues with Jet accessing HPSS.

@MichaelLueken
Copy link
Collaborator

The Jenkins WE2E tests that previously failed on Jet, due to access issues to HPSS, have successfully passed. Will now move forward with merging this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
run_we2e_coverage_tests Run the coverage set of SRW end-to-end tests
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants